Abstract

We study the design of local algorithms for massive graphs. A local algorithm is one 
that finds a solution containing or near a given vertex without looking at the whole graph. 
We present a local clustering algorithm. Our algorithm finds a good cluster — a subset of 
vertices whose internal connections are significantly richer than its external connections — 
near a given vertex. The running time of our algorithm, when it finds a non-empty local 
cluster, is nearly linear in the size of the cluster it outputs. 

Our clustering algorithm could be a useful primitive for handling massive graphs, such as 
social networks and web-graphs. As an application of this clustering algorithm, we present a 
partitioning algorithm that finds an approximate sparsest cut with nearly optimal balance. 
Our algorithm takes time nearly linear in the number of edges of the graph.

Using the partitioning algorithm of this paper, we have designed a nearly-linear time 
algorithm for constructing spectral sparsifiers of graphs, which we in turn use in a nearly- 
linear time algorithm for solving linear systems in symmetric, diagonally-dominant matrices. 
The linear system solver also leads to a nearly linear-time algorithm for approximating the 
second-smallest eigenvalue and corresponding eigenvector of the Laplacian matrix of a graph. 
These other results are presented in two companion papers. 



This paper is the first in a sequence of three papers expanding on material that appeared first under the title
"Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems" [ST03].
The second paper, "Spectral Sparsification of Graphs" [ST08b], contains further results on partitioning graphs,
and applies them to producing spectral sparsifiers of graphs. The third paper, "Nearly-Linear Time Algorithms
for Preconditioning and Solving Symmetric, Diagonally Dominant Linear Systems" [ST08a], contains the results
on solving linear equations and approximating eigenvalues and eigenvectors.

This material is based upon work supported by the National Science Foundation under Grant Nos. 0325630, 
0634957, 0635102 and 0707522. Any opinions, findings, and conclusions or recommendations expressed in this 
material are those of the authors and do not necessarily reflect the views of the National Science Foundation. 
1 Introduction



Given a vertex of interest in a massive graph, we would like to find a small cluster around that 
vertex, in time proportional to the size of the cluster. The algorithm we introduce will solve this 
problem while only examining vertices near the initial vertex, under some reasonable notion of 
nearness. We call such an algorithm a local algorithm. 

Our local clustering algorithm provides a very powerful primitive for the design of fast 
graph algorithms. In Section [3] of this paper, we use it to design the first nearly-linear time 
algorithm for graph partitioning that produces a partition of nearly-optimal balance among 
those approximating a target conductance. In the papers [ST08b] and [ST08a], we proceed
to use this graph partitioning algorithm to design nearly-linear time algorithms for sparsifying 
graphs and for solving symmetric, diagonally-dominant linear systems. 

1.1 Local Clustering 

We say that a graph algorithm is a local algorithm if it is given a particular vertex as input, and 
at each step after the first only examines vertices connected to those it has seen before. The 
use of a local algorithm naturally leads to the question of in which order one should explore 
the vertices of a graph. While it may be natural to explore vertices in order of shortest-path 
distance from the input vertex, such an ordering is a poor choice in graphs of low-diameter, such 
as social network graphs [LH08] . We suggest first processing the vertices that are most likely to 
occur in short random walks from the input vertex. That is, we consider a vertex to be near
the input vertex if it is likely to appear in a short random walk from the input vertex. 

In Section 2 we use a local graph exploration process to find a cluster that is near the input
vertex. Following Kannan, Vempala and Vetta [KVV04], we say that a set of vertices is a good
cluster if it has low conductance; that is, if it has many more internal than external edges. We
give an efficient local clustering algorithm, Nibble, that runs in time proportional to the size
of the cluster it outputs. Although our algorithm may not find a local cluster for some input
vertices, we will show that it is usually successful. In particular, we prove the following theorem:
There exists a constant $\alpha > 0$ such that for any target conductance $\phi$ and any cluster $C_0$ of
conductance at most $\alpha \cdot \phi^2 / \log^3 n$, when given a random vertex $v$ sampled according to degree
inside $C_0$, Nibble will return a cluster $C$ mostly inside $C_0$ and with conductance at most $\phi$,
with probability at least 1/2.

The local clustering algorithm Nibble makes a novel use of random walks. For a positive 
integer $t$, suppose $p_{t,v}$ is the probability distribution of the $t$-step random walk starting at $v$.
As the support of $p_{t,v}$ (the set of nodes with positive probability) could grow rapidly, Nibble
maintains a truncated version of the distribution. At each step of the truncated random walks,
Nibble looks for a cluster among only nodes with high probability. The truncation is critical
to ensure that the clustering algorithm is output sensitive. It guarantees that the size of the 
support of the distribution that Nibble maintains is not too much larger than the size of the 
cluster it produces. The cluster that Nibble produces is local to the starting vertex v in the 
sense that it consists of nodes that are among the most favored destinations of random walks 
starting from v. 

By using the personal PageRank vector [PBMW98] to define nearness, Andersen, Chung and
Lang [ACL06] have produced an improved version of our algorithm Nibble, which they call
PageRank-Nibble. Following this work, other local algorithms have been designed by Andersen
et al. [ABC+07] for approximately computing Personal PageRank vectors, by Andersen [And08]
for finding dense subgraphs, and by Andersen, Chung and Lang [ACL07] for partitioning directed
graphs.

1.2 Nearly Linear-Time Algorithms 

Our local clustering algorithm provides a powerful tool for designing fast graph algorithms. In 
this paper and its two companion papers, we show how to use it to design randomized, nearly 
linear-time algorithms for several important graph-theoretic and numerical problems. 

The need for algorithms whose running time is linear or nearly linear in their input size has 
increased as algorithms handle larger inputs. For example, in circuit design and simulation, an 
Intel Dual Core Itanium processor has more than one billion transistors, which is more than 100 
times the number of transistors that the Pentium had in 2000 [Cor05]; in scientific computing,
one often needs to solve linear systems that involve hundreds of millions of variables [SLM+02];
in modern information infrastructure, the web has grown into a graph of hundreds of billions of
nodes [GS05]. As a result of this rapid growth in problem size, what used to be considered an
efficient algorithm, such as an $O(n^{1.5})$-time algorithm, may no longer be adequate for solving
problems of these scales. Space complexity poses an even greater problem. 

Many basic graph-theoretic problems such as connectivity and topological sorting can be 
solved in linear or nearly-linear time. The efficient algorithms for these problems are built 
on linear-time primitives such as Breadth-First-Search (BFS) and Depth-First-Search (DFS). 
Minimum Spanning Trees (MST) and Shortest-Path Trees are examples of other commonly 
used nearly linear-time primitives. We hope to build up the library of nearly-linear time graph 
algorithms that may be used as primitives. While the analyzable variants of the algorithms we 
present here, and even their improved versions by Andersen, Chung and Lang [ACL06], may not 
be immediately useful in practice, we believe practical algorithms may be derived from them by 
making less conservative choices of parameters. 

Our local clustering algorithm provides an exciting new primitive for developing nearly linear-time
graph algorithms. Because its running time is proportional to the size of the cluster it
produces, we can repeatedly apply it to remove many clusters from a graph, all within nearly-linear
time.

In the second part of this paper, we use Nibble as a subroutine to construct a randomized 
graph partitioning algorithm that runs in nearly-linear time. To the best of our knowledge, 
this is the first nearly linear-time partitioning algorithm that finds an approximate sparsest 
cut with approximately optimal balance. In our first companion paper [ST08b], we apply this
new partitioning algorithm to develop a nearly-linear-time algorithm for producing spectral 
sparsifiers of graphs. We begin that paper by extending the partitioning algorithm of this paper 
to obtain a stronger guarantee on its output: if it outputs a small set, then the complement 
must be contained in a subgraph whose conductance is higher than the target. 

2 Clusters and Conductance 

Let $G = (V, E)$ be an undirected graph with $V = \{1, \ldots, n\}$. A cluster of $G$ is a subset of $V$
that is richly intra-connected but sparsely connected with the rest of the graph. The quality of a
cluster can be measured by its conductance, the ratio of the number of its external connections
to the number of its total connections.

We let $d(i)$ denote the degree of vertex $i$. For $S \subseteq V$, we define $\mu(S) = \sum_{i \in S} d(i)$ (often
called the volume of $S$). So, $\mu(V) = 2|E|$. Let $E(S, V - S)$ be the set of edges connecting a
vertex in $S$ with a vertex in $V - S$. We define the conductance of a set of vertices $S$, written
$\Phi(S)$, by

\[ \Phi(S) \stackrel{\mathrm{def}}{=} \frac{|E(S, V - S)|}{\min(\mu(S), \mu(V - S))}. \]

The conductance of $G$ is then given by

\[ \Phi_G \stackrel{\mathrm{def}}{=} \min_{S \subset V} \Phi(S). \]

We sometimes refer to a subset $S$ of $V$ as a cut of $G$ and refer to $(S, V - S)$ as a partition of
$G$. The balance of a cut $S$ or a partition $(S, V - S)$ is then equal to

\[ \mathrm{bal}(S) = \min(\mu(S), \mu(V - S)) / \mu(V). \]

We call $S$ a sparsest cut of $G$ if $\Phi(S) = \Phi_G$ and $\mu(S)/\mu(V) \le 1/2$.
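
As a small illustration of the definitions of volume, conductance and balance, the following sketch evaluates them on a toy graph. The adjacency-list representation, the function names and the example graph are assumptions made for this illustration only; the graph is taken to be simple (no self-loops), and exact rational arithmetic is used so the values can be compared with hand calculations.

```python
from fractions import Fraction

def volume(adj, S):
    """mu(S): the sum of the degrees of the vertices in S."""
    return sum(len(adj[u]) for u in S)

def cut_size(adj, S):
    """|E(S, V - S)|: the number of edges with exactly one endpoint in S."""
    S = set(S)
    return sum(1 for u in S for w in adj[u] if w not in S)

def conductance(adj, S):
    """Phi(S) = |E(S, V - S)| / min(mu(S), mu(V - S)); S must be a proper non-empty subset."""
    S = set(S)
    rest = set(adj) - S
    return Fraction(cut_size(adj, S), min(volume(adj, S), volume(adj, rest)))

def balance(adj, S):
    """bal(S) = min(mu(S), mu(V - S)) / mu(V)."""
    S = set(S)
    rest = set(adj) - S
    return Fraction(min(volume(adj, S), volume(adj, rest)), volume(adj, adj))

# Hypothetical example: two triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
TWO_TRIANGLES = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
                 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
S = {0, 1, 2}
print(volume(TWO_TRIANGLES, S), conductance(TWO_TRIANGLES, S), balance(TWO_TRIANGLES, S))
```

On this example one triangle has volume 7, a single outgoing edge, and therefore conductance 1/7 with balance 1/2.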

In the construction of a partition of G, we will be concerned with vertex-induced subgraphs 
of G. However, when measuring the conductance and volumes of vertices in these vertex-induced 
subgraphs, we will continue to measure the volume according to the degrees of vertices in the 
original graph. For clarity, we define the conductance of a set $S$ in the subgraph induced by
$A \subseteq V$ by

\[ \Phi_A(S) \stackrel{\mathrm{def}}{=} \frac{|E(S, A - S)|}{\min(\mu(S), \mu(A - S))}, \]

and

\[ \Phi_A \stackrel{\mathrm{def}}{=} \min_{S \subset A} \Phi_A(S). \]

For convenience, we define $\Phi_A(\emptyset) = 1$ and, for $|A| = 1$, $\Phi_A = 1$.

For $A \subseteq V$, we let $G(A)$ denote the subgraph of $G$ induced by the vertices in $A$. We introduce
the notation $G[A]$ to denote the graph $G(A)$ to which self-loops have been added so that every vertex
in $G[A]$ has the same degree as in $G$. Each self-loop adds 1 to the degree. We remark that if
$G(A)$ is the subgraph of $G$ induced on the vertices in $A$, then

\[ \Phi_{G(A)} \ge \Phi_{G[A]} = \Phi_A. \]

So, when we prove lower bounds on $\Phi_A$, we obtain lower bounds on $\Phi_{G(A)}$.

Clustering is an optimization problem: Given an undirected graph $G$ and a conductance
parameter $\phi$, find a cluster $C$ such that $\Phi(C) \le \phi$, or determine no such cluster exists. The
problem is NP-complete (see, for example, [LR99] or [SS06]). But approximation algorithms exist.
Leighton and Rao [LR99] used linear programming to obtain $O(\log n)$-approximations of the
sparsest cut. Arora, Rao and Vazirani [ARV04] improved this to $O(\sqrt{\log n})$ through semi-definite
programming. Faster algorithms obtaining similar guarantees have been constructed by Arora,
Hazan and Kale [AHK04], Khandekar, Rao and Vazirani [KRV06], Arora and Kale [AK07], and
Orecchia, Schulman, Vazirani, and Vishnoi [OSVV08].

2.1 The Algorithm Nibble

The algorithm Nibble works by approximately computing the distribution of a few steps of the 
random walk starting at a seed vertex v. It is implicit in the analysis of the volume estimation 
algorithm of Lovász and Simonovits [LS93] that one can find a cut with small conductance from
the distributions of the steps of the random walk starting at any vertex from which the walk 
does not mix rapidly. We will observe that a random vertex in a set of low conductance is 
probably such a vertex. We then extend the analysis of Lovasz and Simonovits to show one can 
find a cut with small conductance from approximations of these distributions, and that these 
approximations can be computed quickly. In particular, we will truncate all small probabilities 
that appear in the distributions to 0. In this way, we reduce the work required to compute our 
approximations.

For the rest of this section, we will work with a graph G = (V, E) with n vertices and m 
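edges, so that $\mu(V) = 2m$. We will allow some of these edges to be self-loops. Except for the
self-loops, which we allow to occur with multiplicities, the graph is assumed to be unweighted.
We will let $A$ be the adjacency matrix of this graph. That is,

\[ A(u,v) = \begin{cases} 1 & \text{if } (u,v) \in E \text{ and } u \ne v, \\ k & \text{if } u = v \text{ and this vertex has } k \text{ self-loops,} \\ 0 & \text{otherwise.} \end{cases} \]

We define the following two vectors supported on a set of vertices $S$:

\[ \chi_S(u) = \begin{cases} 1 & \text{for } u \in S, \\ 0 & \text{otherwise,} \end{cases}
\qquad
\psi_S(u) = \begin{cases} d(u)/\mu(S) & \text{for } u \in S, \\ 0 & \text{otherwise.} \end{cases} \]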



We will consider the random walk that at each time step stays at the current vertex with 
probability 1/2, and otherwise moves to the endpoint of a random edge attached to the current 
vertex. Thus, self-loops increase the chance the walk stays at the current vertex. For example, 
if a vertex has 4 edges, one of which is a self-loop, then when the walk is at this vertex it has a 
5/8 chance of staying at that vertex, and a 1/8 chance of moving to each of its 3 neighbors. 

The matrix realizing this walk can be expressed by $M = (A D^{-1} + I)/2$, where $d(i)$ is the
degree of node $i$, and $D$ is the diagonal matrix with diagonal entries $(d(1), \ldots, d(n))$. Typically,
a random walk starts at a node $v$. In this case, the distribution of the random walk at time $t$
evolves according to $p_t = M^t \chi_v$.

We note that $\psi_V$ is the steady-state distribution of the random walk, and that $\psi_S$ is the
restriction of that walk to the set $S$.
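
The lazy walk and the worked example above (a vertex with 4 edges, one of them a self-loop) can be checked with a short sketch. The dictionary-of-dictionaries encoding of $A$, the function names and the star-shaped example graph are assumptions chosen for this illustration; a self-loop is counted once in the degree, as in the text.

```python
from fractions import Fraction

def degrees(A):
    """d(v): the sum of the entries in row v of the symmetric matrix A."""
    return {v: sum(A[v].values()) for v in A}

def lazy_step(A, p):
    """Return M p, where M = (A D^{-1} + I)/2 and p maps vertices to probabilities."""
    d = degrees(A)
    out = {u: Fraction(p.get(u, 0)) / 2 for u in A}      # the I/2 part: stay put
    for v, mass in p.items():
        for u, mult in A[v].items():                     # the A D^{-1}/2 part
            out[u] += Fraction(mass) * mult / d[v] / 2
    return out

# Hypothetical example: vertex 0 has one self-loop and edges to vertices 1, 2 and 3.
STAR_WITH_LOOP = {0: {0: 1, 1: 1, 2: 1, 3: 1}, 1: {0: 1}, 2: {0: 1}, 3: {0: 1}}

one_step = lazy_step(STAR_WITH_LOOP, {0: Fraction(1)})
print(one_step[0], one_step[1])   # chance of staying at 0, chance of moving to 1

# psi_V(u) = d(u) / mu(V) should be left unchanged by one step of the walk.
d = degrees(STAR_WITH_LOOP)
psi_V = {u: Fraction(d[u], sum(d.values())) for u in STAR_WITH_LOOP}
print(lazy_step(STAR_WITH_LOOP, psi_V) == psi_V)
```

The first line of output reproduces the 5/8 and 1/8 probabilities of the example, and the final comparison confirms that $\psi_V$ is stationary on this graph.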

We will use the truncation operation defined by

\[ [p]_\epsilon(u) = \begin{cases} p(u) & \text{if } p(u) \ge d(u)\epsilon, \\ 0 & \text{otherwise.} \end{cases} \]

Our algorithm, Nibble, will generate the sequence of vectors starting at $\chi_v$ by the rules

\[ q_t = \begin{cases} \chi_v & \text{if } t = 0, \\ M r_{t-1} & \text{otherwise,} \end{cases} \tag{1} \]

\[ r_t = [q_t]_\epsilon. \tag{2} \]

That is, at each time step, we will evolve the random walk one step from the current density,
and then round every $q_t(u)$ that is less than $d(u)\epsilon$ to 0. Note that $q_t$ and $r_t$ are not necessarily
probability vectors, as their components may sum to less than 1.
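
The rules (1) and (2) can be sketched as follows. This is an illustrative sketch only: it assumes a simple graph stored as adjacency lists, stores each vector sparsely as a dictionary over its support, and uses hypothetical function names and a hypothetical path graph as the example. Note that each step only touches vertices in the current support and their neighbors, which is what makes the computation local.

```python
from collections import defaultdict
from fractions import Fraction

def lazy_step(adj, p):
    """Return M p for the lazy walk, visiting only the support of p and its neighbors."""
    out = defaultdict(Fraction)
    for v, mass in p.items():
        out[v] += mass / 2                        # stay with probability 1/2
        for u in adj[v]:
            out[u] += mass / (2 * len(adj[v]))    # otherwise follow a random edge
    return dict(out)

def truncate(adj, p, eps):
    """[p]_eps: drop every entry with p(u) < d(u) * eps."""
    return {u: x for u, x in p.items() if x >= len(adj[u]) * eps}

def truncated_walk(adj, v, eps, steps):
    """Return the list of pairs (q_t, r_t) for t = 0, ..., steps."""
    q = {v: Fraction(1)}
    r = truncate(adj, q, eps)
    history = [(q, r)]
    for _ in range(steps):
        q = lazy_step(adj, r)
        r = truncate(adj, q, eps)
        history.append((q, r))
    return history

# Hypothetical example: the path 0 - 1 - 2 - 3, started at vertex 0.
PATH = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
history = truncated_walk(PATH, 0, Fraction(1, 10), 2)
q2, r2 = history[2]
print(q2, r2, sum(r2.values()))
```

In this run the entry of $q_2$ at vertex 2 equals 1/8, which is below $d(2)\epsilon = 1/5$, so it is rounded to 0 and $r_2$ sums to 7/8 rather than 1.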

In the statement of the algorithm and its analysis, we will use the following notation. For a 
vector $p$, we let $S_j(p)$ be the set of $j$ vertices $u$ maximizing $p(u)/d(u)$, breaking ties lexicographically.
That is, $S_j(p) = \{\pi(1), \ldots, \pi(j)\}$, where $\pi$ is the permutation such that

\[ p(\pi(i))/d(\pi(i)) \ge p(\pi(i+1))/d(\pi(i+1)) \]

for all $i$, and $\pi(i) < \pi(i+1)$ when these two ratios are equal. We then set

\[ \lambda_j(p) = \mu(S_j(p)) = \sum_{u \in S_j(p)} d(u). \]

Note that $\lambda_n(p)$ always equals $2m$.

Following Lovász and Simonovits [LS90], we set

\[ I(p,x) = \max_{\substack{w \in [0,1]^n \\ \sum_{u \in V} w(u) d(u) = x}} \; \sum_{u \in V} w(u) p(u). \tag{3} \]

This function $I(p, \cdot)$ is essentially the same as the function $h$ defined by Lovász and Simonovits;
it only differs by a linear transformation.

We remark that for $x = \lambda_j(p)$, $I(p,x) = p(S_j(p))$, and that $I(p,x)$ is linear in $x$ between
these points. Finally, we let $I_x(p,x)$ denote the partial derivative of $I(p,x)$ with respect to $x$,
with the convention that for $x = \lambda_j(p)$,

\[ I_x(p,x) = \lim_{\delta \to 0} I_x(p, x - \delta) = p(\pi(j))/d(\pi(j)), \]

where $\pi$ is the permutation specified above so that $\pi(j) = S_j(p) - S_{j-1}(p)$.

As $p(\pi(i))/d(\pi(i))$ is non-increasing, $I_x(p,x)$ is a non-increasing function in $x$ and $I(p,x)$ is
a concave function in $x$.
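
Because the maximum in (3) is attained by filling volume greedily in order of decreasing $p(u)/d(u)$, the function $I(p,x)$ is easy to evaluate. The sketch below does this for a simple graph with no isolated vertices; the function names and the path-graph example are assumptions made for illustration, and exact rationals are used so the breakpoints can be checked by hand.

```python
from fractions import Fraction

def sweep_order(adj, p):
    """The permutation pi: decreasing p(u)/d(u), ties broken by smaller vertex index."""
    return sorted(adj, key=lambda u: (-Fraction(p.get(u, 0)) / len(adj[u]), u))

def ls_curve(adj, p, x):
    """I(p, x): take vertices in sweep order, the last one fractionally, until volume x is used."""
    total = Fraction(0)
    remaining = Fraction(x)
    for u in sweep_order(adj, p):
        d = len(adj[u])
        w = min(Fraction(1), remaining / d)   # the weight w(u) in [0, 1]
        total += w * p.get(u, 0)
        remaining -= w * d
        if remaining == 0:
            break
    return total

# Hypothetical example: the path 0 - 1 - 2 - 3 with a vector supported on {0, 1, 2}.
PATH = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
p = {0: Fraction(3, 8), 1: Fraction(1, 2), 2: Fraction(1, 8)}
print([ls_curve(PATH, p, x) for x in (1, 2, 3, 5, 6)])
```

Here the sweep order is 0, 1, 2, 3 with breakpoints $\lambda_1 = 1$, $\lambda_2 = 3$, $\lambda_3 = 5$, $\lambda_4 = 6 = 2m$, and the value at $x = 2$ lies on the chord between the values at $x = 1$ and $x = 3$.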

During the course of our exposition, we will need to set many constants, which we collect 
here for convenience. For each, we provide a suitable value and indicate where it is first used in 
the paper. 



constant   value   where first used
$c_1$      200     (12)
$c_2$      280     (15)
$c_3$      1800    (16)
$c_4$      140     Nibble, line (C.4)
$c_5$      20      Definition 2.11
$c_6$      60      Definition 2.11

The following is an exhaustive list of the inequalities we require these constants to satisfy.
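
\begin{align}
c_2 &\ge 2 c_4 \tag{4} \\
c_6 &\ge 2 c_5 \tag{5} \\
c_3 &\ge 8 c_5 \tag{6} \\
c_4 &\ge 4 c_5 \tag{7} \\
\frac{1}{2 c_6} - \frac{1}{c_3} - \frac{1}{2 c_5 c_6} &\ge \frac{1}{c_4} \tag{8} \\
\frac{1}{2 c_5} - \frac{6}{5 c_6} &\ge \frac{1}{c_1} \tag{9} \\
\frac{1}{5} &> \frac{1}{c_5} + \frac{4 c_6}{3 c_3} + \frac{1}{2 c_1} + \frac{1}{2 c_2}. \tag{10}
\end{align}

Given a $\phi$, we set constants that will play a prominent role in our analysis:

\begin{align}
\ell &\stackrel{\mathrm{def}}{=} \lceil \log_2(\mu(V)/2) \rceil, \tag{11} \\
t_1 &\stackrel{\mathrm{def}}{=} \left\lceil \frac{2}{\phi^2} \ln\left( c_1 (\ell + 2) \sqrt{\mu(V)/2} \right) \right\rceil, \tag{12} \\
t_h &\stackrel{\mathrm{def}}{=} h t_1, \quad \text{for } 0 \le h \le \ell + 1, \tag{13} \\
t_{\mathrm{last}} &\stackrel{\mathrm{def}}{=} (\ell + 1) t_1, \text{ and} \tag{14} \\
f_1(\phi) &\stackrel{\mathrm{def}}{=} \frac{1}{c_2 (\ell + 2) t_{\mathrm{last}}}. \tag{15}
\end{align}

Note that

\[ f_1(\phi) \ge \Omega\left( \frac{\phi^2}{\log^3 \mu(V)} \right). \]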
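
The suggested values of the constants can be checked mechanically against the seven required inequalities. The sketch below assumes the reading of the inequalities numbered (4) through (10), namely $c_2 \ge 2c_4$, $c_6 \ge 2c_5$, $c_3 \ge 8c_5$, $c_4 \ge 4c_5$, $1/(2c_6) - 1/c_3 - 1/(2c_5c_6) \ge 1/c_4$, $1/(2c_5) - 6/(5c_6) \ge 1/c_1$ and $1/5 > 1/c_5 + 4c_6/(3c_3) + 1/(2c_1) + 1/(2c_2)$; the dictionary layout is an illustrative choice.

```python
from fractions import Fraction as F

# Suggested values from the table of constants.
c1, c2, c3, c4, c5, c6 = 200, 280, 1800, 140, 20, 60

# Each entry evaluates one inequality exactly, using rational arithmetic.
checks = {
    "(4)": c2 >= 2 * c4,
    "(5)": c6 >= 2 * c5,
    "(6)": c3 >= 8 * c5,
    "(7)": c4 >= 4 * c5,
    "(8)": F(1, 2 * c6) - F(1, c3) - F(1, 2 * c5 * c6) >= F(1, c4),
    "(9)": F(1, 2 * c5) - F(6, 5 * c6) >= F(1, c1),
    "(10)": F(1, 5) > F(1, c5) + F(4 * c6, 3 * c3) + F(1, 2 * c1) + F(1, 2 * c2),
}
for name, holds in checks.items():
    print(name, holds)
```

Under this reading, inequalities (4) and (9) hold with equality for the suggested values, so those two constraints leave no slack.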
$C$ = Nibble($G$, $v$, $\phi$, $b$)

where $v$ is a vertex, $0 < \phi < 1$, and $b$ is a positive integer.

1. Set
   \[ \epsilon = 1 / (c_3 (\ell + 2) t_{\mathrm{last}} 2^b). \tag{16} \]

2. Set $q_0 = \chi_v$ and $r_0 = [q_0]_\epsilon$.

3. For $t = 1$ to $t_{\mathrm{last}}$

   (a) Set $q_t = M r_{t-1}$.

   (b) Set $r_t = [q_t]_\epsilon$.

   (c) If there exists a $j$ such that

       (C.1) $\Phi(S_j(q_t)) \le \phi$,

       (C.2) $\lambda_j(q_t) \le (5/6) \mu(V)$,

       (C.3) $2^b \le \lambda_j(q_t)$, and

       (C.4) $I_x(q_t, 2^b) \ge 1 / (c_4 (\ell + 2) 2^b)$,

       then return $C = S_j(q_t)$ and quit.

4. Return $C = \emptyset$.






Condition (C.l) guarantees that the set C has low conductance. Condition (C.2) ensures 
that it does not contain too much volume, while condition (C.3) ensures that it does not contain 
too little. Condition (C.4) guarantees that many elements of C have large probability mass. 
While it would be more natural to define condition (C.4) as a constraint on $I_x(q_t, \lambda_j(q_t))$ instead
of $I_x(q_t, 2^b)$, our proof of correctness requires the latter.
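
To make the control flow of Nibble concrete, here is a minimal sketch under several stated assumptions. It is not the implementation analyzed in Theorem 2.1: it assumes a simple graph given as adjacency lists, uses floating-point arithmetic, returns the smallest qualifying $j$ (the algorithm only asks that some $j$ exists), and only sweeps over the support of $q_t$. It also assumes the parameter settings $\ell = \lceil \log_2(\mu(V)/2) \rceil$, $t_1 = \lceil (2/\phi^2) \ln(c_1(\ell+2)\sqrt{\mu(V)/2}) \rceil$, $t_{\mathrm{last}} = (\ell+1)t_1$ and $\epsilon = 1/(c_3(\ell+2)t_{\mathrm{last}}2^b)$, and reads conditions (C.1) to (C.4) as $\Phi(S_j(q_t)) \le \phi$, $\lambda_j(q_t) \le (5/6)\mu(V)$, $2^b \le \lambda_j(q_t)$ and $I_x(q_t, 2^b) \ge 1/(c_4(\ell+2)2^b)$. All function names and the example graph are hypothetical.

```python
import math

C1, C3, C4 = 200, 1800, 140   # the constants c_1, c_3, c_4 (suggested values)

def lazy_step(adj, p):
    """One step of the lazy walk applied to a sparse vector p (dict over its support)."""
    out = {}
    for v, mass in p.items():
        out[v] = out.get(v, 0.0) + mass / 2
        for u in adj[v]:
            out[u] = out.get(u, 0.0) + mass / (2 * len(adj[v]))
    return out

def nibble(adj, v, phi, b):
    deg = {u: len(adj[u]) for u in adj}
    vol_V = sum(deg.values())                                   # mu(V) = 2m
    ell = math.ceil(math.log2(vol_V / 2))
    t1 = math.ceil(2 / phi ** 2 * math.log(C1 * (ell + 2) * math.sqrt(vol_V / 2)))
    t_last = (ell + 1) * t1
    eps = 1 / (C3 * (ell + 2) * t_last * 2 ** b)
    r = {v: 1.0} if 1.0 >= deg[v] * eps else {}                 # r_0 = [chi_v]_eps
    for _ in range(t_last):
        q = lazy_step(adj, r)                                   # q_t = M r_{t-1}
        r = {u: x for u, x in q.items() if x >= deg[u] * eps}   # r_t = [q_t]_eps
        order = sorted(q, key=lambda u: (-q[u] / deg[u], u))    # sweep order on the support
        # (C.4): slope of I(q_t, .) at x = 2^b; it is 0 if the support has volume < 2^b.
        slope, acc = 0.0, 0
        for u in order:
            acc += deg[u]
            if acc >= 2 ** b:
                slope = q[u] / deg[u]
                break
        if slope < 1 / (C4 * (ell + 2) * 2 ** b):
            continue
        inside, vol, cut = set(), 0, 0
        for u in order:
            # Edges from u into the set stop being cut edges; the others become cut edges.
            cut += deg[u] - 2 * sum(1 for w in adj[u] if w in inside)
            vol += deg[u]
            inside.add(u)
            if vol > 5 * vol_V / 6:                             # (C.2) fails from here on
                break
            if vol >= 2 ** b and cut <= phi * min(vol, vol_V - vol):   # (C.3) and (C.1)
                return inside
    return set()

# Hypothetical example: two copies of K_4 joined by the single edge (3, 4).
BARBELL = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2, 4],
           4: [3, 5, 6, 7], 5: [4, 6, 7], 6: [4, 5, 7], 7: [4, 5, 6]}
print(nibble(BARBELL, 0, 0.5, 1))
```

On this example the sketch stops at the first step and returns the three clique vertices with the highest ratio $q_1(u)/d(u)$, a set of volume 9 with 3 outgoing edges. Picking the smallest qualifying $j$ is one design choice among several; a sweep that prefers the lowest-conductance prefix would return the whole clique instead.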

In the rest of this section, we will prove the following theorem on the performance of Nibble. 

Theorem 2.1 (Nibble). Nibble can be implemented so that on all inputs, it runs in time
$O(2^b (\log^6 m) / \phi^4)$. Moreover, Nibble satisfies the following properties.

(N.1) When $C$ = Nibble($G$, $v$, $\phi$, $b$) is non-empty,

\[ \Phi(C) \le \phi \quad \text{and} \quad \mu(C) \le (5/6) \mu(V). \]

(N.2) Each set $S$ satisfying

\[ \mu(S) \le (2/3) \mu(V) \quad \text{and} \quad \Phi(S) \le f_1(\phi) \]

has a subset $S^g$ such that

(N.2.a) $\mu(S^g) \ge \mu(S)/2$, and

(N.2.b) $v \in S^g$ and $C$ = Nibble($G$, $v$, $\phi$, $b$) $\ne \emptyset$ imply $\mu(C \cap S) \ge 2^{b-1}$.

(N.3) The set $S^g$ may be partitioned into subsets $S^g_0, \ldots, S^g_\ell$ such that if $v \in S^g_b$, then the set $C$
output by Nibble($G$, $v$, $\phi$, $b$) will not be empty.

2.2 Basic Inequalities about Random Walks

We first establish some basic inequalities that will be useful in our analysis. Readers who are
eager to see the analysis of Nibble can first skip this subsection. Suppose $G = (V, E)$ is an
undirected graph. Recall $M = (A D^{-1} + I)/2$, where $A$ is the adjacency matrix of $G$.

Proposition 2.2 (Monotonicity of Mult by M). For all non-negative vectors $p$,

\[ \| D^{-1} (M p) \|_\infty \le \| D^{-1} p \|_\infty. \]

Proof. Applying the transformation $z = D^{-1} p$, we see that it is equivalent to show that for all $z$

\[ \| D^{-1} M D z \|_\infty \le \| z \|_\infty. \]

To prove this, we note that $D^{-1} M D = D^{-1} (A D^{-1} + I) D / 2 = M^T$, and the sum of the entries
in each row of this matrix is 1. □

Definition 2.3. For a set $S \subseteq V$, we define the matrix $D_S$ to be the diagonal matrix such that
$D_S(u,u) = 1$ if $u \in S$ and 0 otherwise.

Proposition 2.4. For every $S \subseteq V$, all non-negative vectors $p$ and $q$, and every $t \ge 1$,

\[ p^T (D_S M)^t q \le p^T M^t q. \]

Proof. For $t = 1$, we observe

\[ p^T M q = p^T ((D_S + D_{\bar{S}}) M) q = p^T (D_S M) q + p^T (D_{\bar{S}} M) q \ge p^T (D_S M) q, \]

as $p$, $q$, $D_{\bar{S}}$, and $M$ are all non-negative. The proposition now follows by induction. □
Proposition 2.5 (Escaping Mass). For all $t \ge 0$ and for all $S \subset V$,

\[ \mathbf{1}^T (D_S M)^t \psi_S \ge 1 - t \Phi_V(S)/2. \]

Proof. Note that $M \psi_S$ is the distribution after a single-step walk from a random vertex in $S$
and $\mathbf{1}^T D_S (M \psi_S)$ is the probability that the walk stays inside $S$. Thus, $\mathbf{1}^T (D_S M)^t \psi_S$ is the
probability that a $t$-step walk starting from a random vertex in $S$ stays entirely in $S$.
We first prove by induction that for all $t \ge 0$,

\[ \| D^{-1} (D_S M)^t \psi_S \|_\infty \le 1/\mu(S). \tag{17} \]

The base case, $t = 0$, follows from the fact that $\| D^{-1} \psi_S \|_\infty = 1/\mu(S)$. To complete the
induction, observe that if $x$ is a non-negative vector such that $\| D^{-1} x \|_\infty \le 1/\mu(S)$, then

\[ \| D^{-1} (D_S M) x \|_\infty = \| D_S D^{-1} M x \|_\infty \le \| D^{-1} M x \|_\infty \le \| D^{-1} x \|_\infty \le 1/\mu(S), \]

where the second-to-last inequality follows from Proposition 2.2.
We will now prove that for all $t$,

\[ \mathbf{1}^T (D_S M)^t \psi_S - \mathbf{1}^T (D_S M)^{t+1} \psi_S \le \Phi_V(S)/2, \]

from which the proposition follows, as $\mathbf{1}^T \psi_S = 1$.
Observing that $\mathbf{1}^T M = \mathbf{1}^T$, we compute

\begin{align*}
\mathbf{1}^T (D_S M)^t \psi_S - \mathbf{1}^T (D_S M)^{t+1} \psi_S
&= \mathbf{1}^T (I - D_S M) (D_S M)^t \psi_S \\
&= \mathbf{1}^T (M - D_S M) (D_S M)^t \psi_S \\
&= \mathbf{1}^T (I - D_S) M (D_S M)^t \psi_S \\
&= \chi_{\bar{S}}^T M (D_S M)^t \psi_S \\
&= (1/2) \chi_{\bar{S}}^T (I + A D^{-1}) (D_S M)^t \psi_S \\
&= (1/2) \chi_{\bar{S}}^T (A D^{-1}) (D_S M)^t \psi_S && \text{(as } \chi_{\bar{S}}^T I D_S = 0 \text{)} \\
&\le (1/2) |E(S, V - S)| \, \| D^{-1} (D_S M)^t \psi_S \|_\infty \\
&\le \frac{1}{2} \frac{|E(S, V - S)|}{\mu(S)} && \text{(by inequality (17))} \\
&\le \Phi_V(S)/2.
\end{align*}

□
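
Proposition 2.5 can be sanity-checked numerically on a small example. The sketch below computes $\mathbf{1}^T (D_S M)^t \psi_S$ exactly for a simple graph and compares it with $1 - t\Phi_V(S)/2$; the function name, the adjacency-list encoding and the example graph are assumptions made for this illustration, and a check on one graph is of course not a proof.

```python
from fractions import Fraction

def stay_probability(adj, S, t):
    """1^T (D_S M)^t psi_S: chance that a t-step lazy walk from psi_S never leaves S."""
    S = set(S)
    vol_S = sum(len(adj[u]) for u in S)
    p = {u: Fraction(len(adj[u]), vol_S) for u in S}       # psi_S
    for _ in range(t):
        nxt = {u: p[u] / 2 for u in S}                     # lazy part of M
        for v in S:
            for u in adj[v]:
                if u in S:                                 # D_S discards mass that leaves S
                    nxt[u] += p[v] / (2 * len(adj[v]))
        p = nxt
    return sum(p.values())

# Hypothetical example: two triangles joined by the edge (2, 3); S is one triangle.
TWO_TRIANGLES = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
                 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
S = {0, 1, 2}
phi_S = Fraction(1, 7)     # one cut edge, and min(mu(S), mu(V - S)) = 7
for t in range(6):
    print(t, stay_probability(TWO_TRIANGLES, S, t), 1 - t * phi_S / 2)
```

For $t = 1$ the two quantities coincide on this example (both equal 13/14), which shows that the bound is tight for a single step.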



2.3 The Analysis of Nibble 

Our analysis of Nibble consists of three main steps. First, we define the sets $S^g$ mentioned in
Theorem 2.1 and establish property (N.2). We then refine the structure of $S^g$ to define sets $S^g_b$
and prove property (N.3). The sets $S^g$ and $S^g_b$ are defined in terms of the distributions of random
walks from a vertex in $S$, without reference to the truncation we perform in the algorithm. We
then analyze the impact of truncation used in Nibble and extend the theory of Lovász and
Simonovits [LS93] to truncated random walks.



Step 1: $S^g$ and its properties

Definition 2.6 ($S^g$). For each set $S \subseteq V$, we define $S^g$ to be the set of nodes $v$ in $S$ such that
for all $t \le t_{\mathrm{last}}$,

\[ \chi_{\bar{S}}^T M^t \chi_v \le t_{\mathrm{last}} \Phi(S). \]

Note that $\chi_{\bar{S}}^T M^t \chi_v$ denotes the probability that a $t$-step random walk starting from $v$
terminates outside $S$. Roughly speaking, $S^g$ is the set of vertices $v \in S$ such that a random walk
from $v$ is reasonably likely to still be in $S$ after $t_{\mathrm{last}}$ time steps. We will prove the following
bound on the volume of $S^g$.

Lemma 2.7 (Volume of $S^g$).

\[ \mu(S^g) \ge \mu(S)/2. \]

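Proof. Let $S \subseteq V$, and let $D_S$ be the diagonal matrix such that $D_S(u,u) = 1$ if $u \in S$ and 0
otherwise. For $t \ge 0$,

\begin{align*}
\chi_{\bar{S}}^T M^t \chi_v &= (\mathbf{1} - \chi_S)^T M^t \chi_v \\
&= \mathbf{1}^T \chi_v - \chi_S^T M^t \chi_v \\
&= 1 - \chi_S^T M^t \chi_v \\
&\le 1 - \mathbf{1}^T (D_S M)^t \chi_v, && \text{by Proposition 2.4} \\
&\le 1 - \mathbf{1}^T (D_S M)^{t_{\mathrm{last}}} \chi_v, && \text{for } t \le t_{\mathrm{last}},
\end{align*}

as $\mathbf{1}^T (D_S M)^t \chi_v$ is a non-increasing function of $t$. Define

\[ S' = \{ v : 1 - \mathbf{1}^T (D_S M)^{t_{\mathrm{last}}} \chi_v \le t_{\mathrm{last}} \Phi(S) \}. \]

So, $S' \subseteq S^g$, and it suffices to prove that $\mu(S') \ge \mu(S)/2$.
Applying Proposition 2.5, we obtain

\begin{align*}
t_{\mathrm{last}} \Phi(S)/2 &\ge 1 - \mathbf{1}^T (D_S M)^{t_{\mathrm{last}}} \psi_S \\
&= \sum_{v \in S} \frac{d(v)}{\mu(S)} \left( 1 - \mathbf{1}^T (D_S M)^{t_{\mathrm{last}}} \chi_v \right) \\
&\ge \sum_{v \in S - S'} \frac{d(v)}{\mu(S)} t_{\mathrm{last}} \Phi(S), && \text{by the definition of } S' \\
&= \frac{\mu(S - S')}{\mu(S)} t_{\mathrm{last}} \Phi(S).
\end{align*}

So, we may conclude

\[ \frac{\mu(S - S')}{\mu(S)} \le \frac{1}{2}, \]

from which the lemma follows. □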

We now prove the following lemma, which says that if Nibble is started from any $v \in S^g$
with parameter $b$ and returns a non-empty set $C$, then $\mu(C \cap S) \ge 2^{b-1}$.

Lemma 2.8 (N2). Let $S \subseteq V$ be a set of vertices such that $\Phi(S) \le f_1(\phi)$. If Nibble is run with
parameter $b$, is started at a $v \in S^g$, and outputs a non-empty set $C$, then $\mu(C \cap S) \ge 2^{b-1}$.

Proof. For $v \in S^g$, let $q_t$ be given by (1) and (2). Then, for $t \le t_{\mathrm{last}}$,

\[ \chi_{\bar{S}}^T q_t \le \chi_{\bar{S}}^T M^t \chi_v \le \Phi(S) t_{\mathrm{last}} \le f_1(\phi) t_{\mathrm{last}} \le \frac{1}{c_2 (\ell + 2)}, \]

where the second inequality follows from the definition of $S^g$.

Let $t$ be the index of the step at which the set $C$ is generated. Let $j'$ be the least integer such
that $\lambda_{j'}(q_t) \ge 2^b$. Condition (C.3) implies $j' \le j$. As $I_x$ is non-increasing in its second argument
and constant between $2^b$ and $\lambda_{j'}(q_t)$, Condition (C.4) guarantees that for all $u \in S_{j'}(q_t)$,

\[ q_t(u)/d(u) \ge 1 / (c_4 (\ell + 2) 2^b). \]

Thus,

\begin{align*}
\mu(S_{j'}(q_t) \cap \bar{S}) = \sum_{u \in S_{j'}(q_t) \cap \bar{S}} d(u)
&\le \sum_{u \in S_{j'}(q_t) \cap \bar{S}} c_4 (\ell + 2) 2^b q_t(u) \\
&\le c_4 (\ell + 2) 2^b (\chi_{\bar{S}}^T q_t) \le \frac{c_4 (\ell + 2) 2^b}{c_2 (\ell + 2)} \le 2^{b-1},
\end{align*}

by (4). So, $\mu(S_{j'}(q_t) \cap S) \ge 2^{b-1}$, and, as $j' \le j$,

\[ \mu(S_j(q_t) \cap S) \ge \mu(S_{j'}(q_t) \cap S) \ge 2^{b-1}. \]

□



Step 2: Refining $S^g$

Before defining the sets $S^g_b$, we first recall some of the facts we can infer about the function $I$
from the work of Lovász and Simonovits. These facts will motivate our definitions and analysis.
In the first part of the proof of Lemma 1.4 of [LS90], Lovász and Simonovits prove

Lemma 2.9. For every non-negative vector $p$ and every $x$,

\[ I(Mp, x) \le I(p, x). \tag{18} \]

For each $p_t$, $I(p_t, x)$ is a concave function that starts at $(0,0)$ and goes to $(\mu(V), 1)$.
Lemma 2.9 says that for each $t$, the curve defined by $I(p_{t+1}, \cdot)$ lies below the curve defined
by $I(p_t, \cdot)$. In particular,

\[ \forall x, \quad I(p_{t+1}, x) \le I(p_t, x). \tag{19} \]

If none of the sets $S_j(p_{t+1})$ has conductance less than $\phi$, then Lovász and Simonovits prove
a bound on how far below $I(p_t, \cdot)$ the curve of $I(p_{t+1}, \cdot)$ must lie. The following Lemma is a
special case of Lemma 1.4 of [LS93], restricted to points $x$ of the form $\lambda_j(Mp)$. Lovász and
Simonovits [LS90] claim that the following is true for all $x$, but point out in the journal version
of their paper [LS93] that this claim was false. Fortunately, we do not need the stronger claim.

Lemma 2.10. For any non-negative vector $p$, if $\Phi(S_j(Mp)) \ge \phi$, then for $x = \lambda_j(Mp)$,

\[ I(Mp, x) \le \frac{1}{2} \left( I(p, x - 2\phi\hat{x}) + I(p, x + 2\phi\hat{x}) \right), \]

where $\hat{x}$ denotes $\min(x, 2m - x)$.

The mistake in [LS90] is the assertion in the beginning of the proof that the inequality holds
for all $x$ if it holds for all $x$ of form $\lambda_j(Mp)$.

When this lemma applies, one may draw a chord across the curve of $I(p_t, \cdot)$ around $x$ of
width proportional to $\phi$, and know that $I(p_{t+1}, x)$ lies below. Thus, we know that if none of the
sets $S_j(p_t)$ has conductance less than $\phi$, then the curve $I(p_t, \cdot)$ will approach a straight line. On
the other hand, Proposition 2.5 will tell us that some point of $I(p_{t_{\mathrm{last}}}, \cdot)$ lies well above this line
(see Lemma 2.14).

We will now define the sets $S^g_b$ for $b = 1, \ldots, \ell$, simultaneously with two quantities, $h_v$ and
$x_h$, where $h_v$ is such that Nibble will stop between iterations $t_{h_v - 1}$ and $t_{h_v}$ and $x_{h_v}$ is the x-coordinate
of a point that may be shown in Lemmas 2.14 and 2.17 to contradict the conclusion
of Lemma 2.10 and thereby enable us to find a set of low conductance.
Definition 2.11 ($x_h$, $h_v$ and $S^g_b$). Given a $v \in S^g$, let $p_t = M^t \chi_v$. For $0 \le h \le \ell + 1$, define
$x_h(v)$ to be the real number such that

\[ I(p_{t_h}, x_h(v)) = \frac{h + 1}{c_5 (\ell + 2)}. \]

We write $x_h$ instead of $x_h(v)$ when $v$ is clear from context. Define

\[ h_v = \begin{cases} \ell + 1 & \text{if } x_\ell(v) \ge 2m / c_6 (\ell + 2), \text{ and} \\ \min\{ h : x_h \le 2 x_{h-1} \} & \text{otherwise.} \end{cases} \]

We define

\[ S^g_0 = \{ v : x_{h_v - 1}(v) < 2 \}, \]

and for $b = 1, \ldots, \ell$, we define

\[ S^g_b = \{ v : x_{h_v - 1}(v) \in [2^b, 2^{b+1}) \}. \]

Proposition 2.12. The quantities $h_v$ are well-defined and the sets $S^g_b$ partition $S^g$. Moreover,
$x_{h-1} < x_h$ for all $h$.

Proof. It follows from the definition of $I$ that for a probability vector $p$ the slope of $I(p, \cdot)$ is
always less than 1, and so $x_0 \ge 1 / c_5 (\ell + 2)$. If $x_\ell < \mu(V) / c_6 (\ell + 2)$, then

\[ x_\ell / x_0 < \frac{\mu(V) \, c_5 (\ell + 2)}{c_6 (\ell + 2)} \le \mu(V)/2, \quad \text{(by inequality (5))} \]

so there is an integer $h \le \ell$ such that $x_h \le 2 x_{h-1}$, and so the quantities $h_v$ are well-defined.
To see that the sets $S^g_b$ partition $S^g$, it now suffices to observe that $x_{h_v - 1} < \mu(V) \le 2^{\ell + 1}$.
Finally, to show that $x_{h-1} < x_h$, we apply Lemma 2.9 to show

\[ I(p_{t_h}, x_{h-1}) \le I(p_{t_{h-1}}, x_{h-1}) = \frac{h}{c_5 (\ell + 2)}. \]

As $I(p_{t_h}, \cdot)$ is non-decreasing and

\[ I(p_{t_h}, x_h) = \frac{h + 1}{c_5 (\ell + 2)}, \]

we can conclude that $x_h > x_{h-1}$. □
Step 3: Clustering and truncated random walks 

We now establish that vectors produced by the truncated random walk do not differ too much 
from those produced by the standard random walk. 

Lemma 2.13 (Low-impact Truncation). For all $u \in V$ and $t$,

\[ p_t(u) \ge q_t(u) \ge r_t(u) \ge p_t(u) - t \epsilon d(u). \tag{20} \]

For all $t$ and $x$,

\[ I(p_t, x) \ge I(q_t, x) \ge I(r_t, x) \ge I(p_t, x) - \epsilon x t. \tag{21} \]

Proof. The left-hand inequalities of (20) are trivial. To prove the right-hand inequality of (20),
we consider $p_t - [p_t]_\epsilon$, observe that by definition

\[ \| D^{-1} (p_t - [p_t]_\epsilon) \|_\infty \le \epsilon, \]

and then apply Proposition 2.2. Inequality (21) then follows from (20) and the definition (3) of $I$. □



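Lemma 2.14 (Lower bound on I). Let $S \subseteq V$ be a set of vertices such that $\mu(S) \le (2/3) \mu(V)$
and $\Phi(S) \le f_1(\phi)$, and let $v$ lie in $S^g_b$. Define $q_t$ by running Nibble with parameter $b$.

1. If $x_\ell(v) \ge 2m / c_6 (\ell + 2)$, then

\[ I(q_{t_{\ell+1}}, (2/3)(2m)) \ge 1 - \frac{1}{c_2 (\ell + 2)} - \frac{4 c_6}{3 c_3}. \tag{22} \]

2. Otherwise,

\[ I(q_{t_{h_v}}, x_{h_v}) \ge \frac{h_v + 1/2}{c_5 (\ell + 2)}. \tag{23} \]

Proof. In the case $x_\ell(v) \ge 2m / c_6 (\ell + 2)$, we compute

\begin{align*}
I(p_{t_{\ell+1}}, (2/3)(2m)) &\ge I(p_{t_{\ell+1}}, \mu(S)) \\
&\ge \chi_S^T p_{t_{\ell+1}} \\
&\ge 1 - t_{\mathrm{last}} f_1(\phi), && \text{by the definition of } S^g \\
&= 1 - \frac{1}{c_2 (\ell + 2)}, && \text{by (15).}
\end{align*}

As $2^{b+1} > x_\ell \ge 2m / c_6 (\ell + 2)$, we may use Lemma 2.13 to show

\[ I(q_{t_{\ell+1}}, (2/3)(2m)) \ge 1 - \frac{1}{c_2 (\ell + 2)} - \epsilon (4m/3) t_{\mathrm{last}}
= 1 - \frac{1}{c_2 (\ell + 2)} - \frac{4m}{3 c_3 (\ell + 2) 2^b}
\ge 1 - \frac{1}{c_2 (\ell + 2)} - \frac{4 c_6}{3 c_3}, \]

by (16).

If $x_\ell(v) < 2m / c_6 (\ell + 2)$, we compute

\begin{align*}
I(q_{t_{h_v}}, x_{h_v}) &\ge I(p_{t_{h_v}}, x_{h_v}) - \epsilon t_{\mathrm{last}} x_{h_v} && \text{(by Lemma 2.13)} \\
&= \frac{h_v + 1}{c_5 (\ell + 2)} - \epsilon t_{\mathrm{last}} x_{h_v} \\
&= \frac{h_v + 1}{c_5 (\ell + 2)} - \frac{x_{h_v}}{c_3 (\ell + 2) 2^b} && \text{(by (16))} \\
&\ge \frac{h_v + 1}{c_5 (\ell + 2)} - \frac{2 x_{h_v - 1}}{c_3 (\ell + 2) 2^b} && \text{(as } x_{h_v} \le 2 x_{h_v - 1} \text{)} \\
&> \frac{h_v + 1}{c_5 (\ell + 2)} - \frac{2^{b+2}}{c_3 (\ell + 2) 2^b} \\
&\ge \frac{h_v + 1/2}{c_5 (\ell + 2)},
\end{align*}

by (6).

□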
Lemma 2.15 (C.4). Let $S \subseteq V$ be a set of vertices such that $\mu(S) \le (2/3) \mu(V)$ and $\Phi(S) \le
f_1(\phi)$, and let $v$ lie in $S^g_b$. If Nibble is run with parameter $b$, then for all $t \in (t_{h_v - 1}, t_{h_v}]$,
condition (C.4) is satisfied.

Proof. We first consider the case in which $x_\ell < 2m / c_6 (\ell + 2)$, which by definition implies

\[ x_{h_v} \le 2 x_{h_v - 1}. \]

In this case, we have

\[ I(q_t, x_{h_v - 1}) \le I(p_t, x_{h_v - 1}) \le I(p_{t_{h_v - 1}}, x_{h_v - 1}) = h_v / c_5 (\ell + 2), \]

where the first inequality follows from Lemma 2.13 and the second follows from Lemma 2.9.
As $I_x(q_t, x)$ is non-increasing in $x$ and $x_{h_v - 1} < x_{h_v} \le 2 x_{h_v - 1}$, we have

\[ I_x(q_t, x_{h_v - 1}) \ge \frac{I(q_t, x_{h_v}) - I(q_t, x_{h_v - 1})}{x_{h_v} - x_{h_v - 1}}
\ge \frac{I(q_{t_{h_v}}, x_{h_v}) - I(q_t, x_{h_v - 1})}{x_{h_v} - x_{h_v - 1}}
\ge \frac{1/2}{c_5 (\ell + 2) x_{h_v - 1}}, \]

where the second inequality follows from Lemma 2.9 and the definition of $q_t$, and the last follows
from (23).

If $x_{h_v - 1} \ge 2$, then $b \ge 1$ and we have $2^b \le x_{h_v - 1} < 2^{b+1}$, and so

\[ I_x(q_t, 2^b) \ge I_x(q_t, x_{h_v - 1}) \ge \frac{1}{2 c_5 (\ell + 2) 2^{b+1}}, \]

and by (7) condition (C.4) is satisfied.

If $x_{h_v - 1} < 2$, then $b = 0$, and so $I_x(q_t, 2^b) = I_x(q_t, x)$ for all $x < 1$, which implies

\[ I_x(q_t, 2^b) \ge I_x(q_t, x_{h_v - 1}) \ge \frac{1}{2 c_5 (\ell + 2) x_{h_v - 1}} > \frac{1}{2 c_5 (\ell + 2) 2^{b+1}}, \]

and condition (C.4) is satisfied.

If $x_\ell \ge 2m / c_6 (\ell + 2)$, in which case $h_v = \ell + 1$ and $x_\ell < 2^{b+1}$, we apply Lemma 2.13 to show
that for all $t \in (t_\ell, t_{\ell + 1}]$,

\[ I(q_t, 2m) \ge I(p_t, 2m) - 2m \epsilon t_{\mathrm{last}} = 1 - \frac{2m}{c_3 (\ell + 2) 2^b} \ge 1 - \frac{2m}{c_3 (\ell + 2) x_\ell / 2} \ge 1 - \frac{2 c_6}{c_3}. \]

On the other hand,

\[ I(q_t, x_\ell) \le I(p_t, x_\ell) \le I(p_{t_\ell}, x_\ell) < 1/c_5. \]

As $I_x(q_t, \cdot)$ is non-increasing and $x_\ell \ge 2^b$, we have

\[ I_x(q_t, 2^b) \ge I_x(q_t, x_\ell) \ge \frac{I(q_t, 2m) - I(q_t, x_\ell)}{2m - x_\ell}
\ge \frac{1}{2m} \left( 1 - \frac{2 c_6}{c_3} - \frac{1}{c_5} \right)
\ge \frac{1}{c_6 (\ell + 2) 2^{b+1}} \left( 1 - \frac{2 c_6}{c_3} - \frac{1}{c_5} \right), \]

as $2^{b+1} > x_\ell \ge 2m / c_6 (\ell + 2)$, and so by (8) condition (C.4) is satisfied. □

It remains to show that conditions (C.1)-(C.3) are met for some $t \in (t_{h_v - 1}, t_{h_v}]$. We will do this
by showing that if at least one of these conditions fails for every $j$ and every $t \in (t_{h_v - 1}, t_{h_v}]$, then
the curve $I(q_{t_h}, \cdot)$ will be too low, in violation of Lemma 2.14.
Lemma 2.16. If there exists a $\beta > 0$ and an $h \in [1, \ell+1]$, such that for all $t \in (t_{h-1}, t_h]$ and for all $j$ either

1. $\Phi(S_j(q_t)) \ge \phi$,

2. $\lambda_j(q_t) > (5/6)2m$, or

3. $I(q_t, \lambda_j(q_t)) < \beta$,

then, for all $x$,

$$I(q_{t_h}, x) < \beta + \frac{3x}{5m} + \sqrt{\hat{x}}\left(1 - \frac{\phi^2}{2}\right)^{t_1}.$$



Proof. We will prove by induction that the conditions of the lemma imply that for all $t \in [t_{h-1}, t_h]$ and all $x$,

$$I(q_t, x) < \beta + \frac{3x}{5m} + \sqrt{\hat{x}}\left(1 - \frac{\phi^2}{2}\right)^{t - t_{h-1}}. \qquad (24)$$

The base case is when $t = t_{h-1}$, in which case (24) is satisfied because

• For $1 \le x \le 2m - 1$, $I(q_t, x) \le I(q_t, 2m) \le 1 \le \sqrt{\hat{x}}$.

• For $0 \le x \le 1$, we have $I(q_t, x) \le \sqrt{\hat{x}}$ as both are $0$ at $x = 0$, the right-hand term dominates at $x = 1$, the left-hand term is linear in this region, and the right-hand term is concave.

• For $2m - 1 \le x \le 2m$, we note that at $x = 2m$, $I(q_t, x) = 1 < 3x/5m$, and that we already know the right-hand term dominates at $x = 2m - 1$. The inequality then follows from the facts that the left-hand term is linear in this region, and the right-hand term is concave.

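Let

$$f(x) = \sqrt{\hat{x}}, \quad\text{where } \hat{x} = \min(x, 2m - x).$$

Lovász and Simonovits [LS90] observe that

$$\frac{1}{2}\left(f(x - 2\phi\hat{x}) + f(x + 2\phi\hat{x})\right) \le f(x)\left(1 - \frac{\phi^2}{2}\right). \qquad (25)$$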



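We now prove that (24) holds for $t$, assuming it holds for $t - 1$, by considering three cases. As the right-hand side is concave and the left-hand side is piecewise-linear between points of the form $\lambda_j(q_t)$, it suffices to prove the inequality at the points $\lambda_j(q_t)$. If $x = \lambda_j(q_t)$ and $I(q_t, x) < \beta$, then (24) holds trivially. Similarly, if $x = \lambda_j(q_t) > (5/6)2m$, then (24) holds trivially as well, as the left-hand side is at most 1, and the right-hand side is at least 1. In the other cases, we have $\Phi(S_j(q_t)) \ge \phi$, in which case we may apply Lemma 2.10 to show that for $x = \lambda_j(q_t)$

$$\begin{aligned}
I(q_t, x) &= I(M r_{t-1}, x) && \text{by definition} \\
&\le \frac{1}{2}\left(I(r_{t-1}, x - 2\phi\hat{x}) + I(r_{t-1}, x + 2\phi\hat{x})\right) && \text{by Lemma 2.10} \\
&\le \frac{1}{2}\left(I(q_{t-1}, x - 2\phi\hat{x}) + I(q_{t-1}, x + 2\phi\hat{x})\right) \\
&< \frac{1}{2}\left(\beta + \frac{3(x - 2\phi\hat{x})}{5m} + \sqrt{\widehat{x - 2\phi\hat{x}}}\left(1 - \frac{\phi^2}{2}\right)^{t-1-t_{h-1}} + \beta + \frac{3(x + 2\phi\hat{x})}{5m} + \sqrt{\widehat{x + 2\phi\hat{x}}}\left(1 - \frac{\phi^2}{2}\right)^{t-1-t_{h-1}}\right) && \text{by induction} \\
&= \beta + \frac{3x}{5m} + \frac{1}{2}\left(\sqrt{\widehat{x - 2\phi\hat{x}}} + \sqrt{\widehat{x + 2\phi\hat{x}}}\right)\left(1 - \frac{\phi^2}{2}\right)^{t-1-t_{h-1}} \\
&\le \beta + \frac{3x}{5m} + \sqrt{\hat{x}}\left(1 - \frac{\phi^2}{2}\right)^{t-t_{h-1}} && \text{by (25).}
\end{aligned}$$

□

We now observe that $t_1$ has been chosen to ensure

$$\sqrt{\hat{x}}\left(1 - \frac{\phi^2}{2}\right)^{t_1} \le \frac{1}{c_1(\ell+2)}. \qquad (26)$$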
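As a sanity check on inequality (25) of Lovász and Simonovits, the following Python sketch evaluates both sides on a grid of values of $x$. It assumes $\hat{x} = \min(x, 2m - x)$ and restricts to $\phi \le 1/2$ so that every argument stays inside $[0, 2m]$. The function names, the grid size, and the floating-point tolerance are illustrative choices made for this note, not part of the original analysis.

```python
import math


def f(x: float, m: float) -> float:
    """f(x) = sqrt(x_hat), with x_hat = min(x, 2m - x) (assumed convention)."""
    x_hat = min(x, 2 * m - x)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(x_hat, 0.0))


def check_potential_drop(m: float, phi: float, steps: int = 400) -> bool:
    """Return True if (25) holds at every grid point x in (0, 2m)."""
    for k in range(1, steps):
        x = 2 * m * k / steps
        x_hat = min(x, 2 * m - x)
        lhs = 0.5 * (f(x - 2 * phi * x_hat, m) + f(x + 2 * phi * x_hat, m))
        rhs = f(x, m) * (1 - phi ** 2 / 2)
        if lhs > rhs + 1e-12:  # small tolerance for rounding error
            return False
    return True


if __name__ == "__main__":
    for phi in (0.05, 0.1, 0.25, 0.5):
        print(phi, check_potential_drop(m=50.0, phi=phi))
```

A grid check of this kind does not prove (25); it only confirms that the reconstructed form of the inequality is consistent on the sampled points.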



Lemma 2.17 (C.1-3). Let $S$ be a set of vertices such that $\mu(S) \le (2/3)(2m)$ and $\Phi(S) \le f_1(\phi)$, and let $v$ lie in $S^g_b$. If Nibble is run with parameter $b$, then there exists a $t \in (t_{h_v-1}, t_{h_v}]$ and a $j$ for which conditions (C.1-3) are satisfied.

Proof. We first show that for $t \in (t_{h_v-1}, t_{h_v}]$, (C.3) is implied by $I(q_t, \lambda_j(q_t)) \ge h_v/c_5(\ell+2)$. To see this, note that for $b > 0$

$$\begin{aligned}
I(q_t, 2^b) &\le I(q_t, x_{h_v-1}), && \text{as } x_{h_v-1} \ge 2^b, \\
&\le I(p_t, x_{h_v-1}), && \text{by Lemma 2.13} \\
&\le I(p_{t_{h_v-1}}, x_{h_v-1}) && \text{by Lemma 2.9} \\
&= \frac{h_v}{c_5(\ell+2)}.
\end{aligned}$$

So, $I(q_t, \lambda_j(q_t)) \ge h_v/c_5(\ell+2)$ implies $\lambda_j(q_t) \ge 2^b$, and we may prove the lemma by exhibiting a $t$ and $j$ for which (C.1), (C.2) and $I(q_t, \lambda_j(q_t)) \ge h_v/c_5(\ell+2)$ hold. On the other hand, if $b = 0$ then $\lambda_j(q_t) \ge 1 = 2^b$ for all $j \ge 1$, so $I(q_t, \lambda_j(q_t)) \ge h_v/c_5(\ell+2)$ trivially implies $j \ge 1$ and therefore (C.3).

We will now finish the proof by contradiction: we show that if no such $t$ and $j$ exist, then the curve $I(q_{t_h}, \cdot)$ would be too low. If for all $t \in (t_{h_v-1}, t_{h_v}]$ and all $j$ one of (C.1), (C.2) or $I(q_t, \lambda_j(q_t)) \ge h_v/c_5(\ell+2)$ fails, then Lemma 2.16 tells us that for all $x$

$$I(q_{t_{h_v}}, x) < \frac{h_v}{c_5(\ell+2)} + \frac{3x}{5m} + \sqrt{\hat{x}}\left(1 - \frac{\phi^2}{2}\right)^{t_1} \le \frac{h_v}{c_5(\ell+2)} + \frac{3x}{5m} + \frac{1}{c_1(\ell+2)},$$

by inequality (26).

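In the case $x_\ell < 2m/c_6(\ell+2)$, we obtain a contradiction by plugging in $x = x_{h_v}$ to find

$$I(q_{t_{h_v}}, x_{h_v}) < \frac{1}{\ell+2}\left(\frac{h_v}{c_5} + \frac{6}{5c_6} + \frac{1}{c_1}\right),$$

which, by the corresponding inequality on the constants, contradicts (23).

In the case that $x_\ell \ge 2m/c_6(\ell+2)$, and so $h_v = \ell+1$, we substitute $x = (2/3)2m$ to obtain

$$I(q_{t_{\ell+1}}, (2/3)2m) < \frac{1}{c_5} + \frac{4}{5} + \frac{1}{c_1(\ell+2)},$$

which by (10) contradicts (22). □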

2.4 Proof of Theorem 2.1

Fact (N.1) follows from conditions (C.1) and (C.2) in the algorithm. Given a set $S$ satisfying $\mu(S) \le (2/3)\mu(V)$, the lower bound on the volume of the set $S^g$ is established in Lemma 2.7. If $\Phi(S) \le f_1(\phi)$ and $v \in S^g_b$, then Lemmas 2.17 and 2.15 show that the algorithm will output a non-empty set. Finally, Lemma 2.8 tells us that if $\Phi(S) \le f_1(\phi)$, $v \in S^g$ and the algorithm outputs a non-empty set $C$, then it satisfies $\mu(C \cap S) \ge 2^{b-1}$.

It remains to bound the running time of Nibble. The algorithm will run for $t_{last}$ iterations. We will now show that with the correct implementation, each iteration takes time $O((\log n)/\epsilon)$. Instead of performing a dense vector multiplication in step (3.a), the algorithm should keep track of the set of vertices $u$ at which $r_t(u) > 0$. Call this set $V_t$. The set $V_t$ can be computed in time $O(|V_t|)$ in step (3.b). Given knowledge of $V_{t-1}$, the multiplication in step (3.a) can be performed in time proportional to

$$\mu(V_{t-1}) = \sum_{u \in V_{t-1}} d(u) \le \sum_{u \in V_{t-1}} r_{t-1}(u)/\epsilon \le 1/\epsilon.$$

Finally, the computation in step (3.c) might require sorting the vertices in $V_t$ according to $r_t$, which could take time at most $O(|V_t| \log n)$. Thus, the run-time of Nibble is bounded by

$$O\left(t_{last}\frac{\log n}{\epsilon}\right) = O\left(t_{last}^2\, 2^b \log^2 m\right) = O\left(\frac{2^b \log^6 m}{\phi^4}\right).$$
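The sparse implementation of steps (3.a) and (3.b) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the graph is stored as adjacency lists in a dictionary, $M$ is taken to be the lazy random walk matrix (half of the mass stays in place, half is spread evenly over the neighbors), and truncation keeps $q(u)$ only when $q(u) \ge \epsilon\, d(u)$, which is the property the bound $d(u) \le r_{t-1}(u)/\epsilon$ relies on. The name `truncated_walk_step` is hypothetical.

```python
from typing import Dict, List

Graph = Dict[int, List[int]]  # adjacency lists of an undirected graph


def truncated_walk_step(adj: Graph, r_prev: Dict[int, float], eps: float) -> Dict[int, float]:
    """Compute r_t from r_{t-1}, touching only vertices in the support V_{t-1}.

    Assumes every vertex in the support has degree at least 1.
    """
    q: Dict[int, float] = {}
    for u, mass in r_prev.items():
        d_u = len(adj[u])
        # Lazy walk (assumed form of M): keep half of the mass at u ...
        q[u] = q.get(u, 0.0) + mass / 2
        # ... and send the other half in equal shares to the neighbors of u.
        share = mass / (2 * d_u)
        for w in adj[u]:
            q[w] = q.get(w, 0.0) + share
    # Truncation (assumed rule): drop entries with q(u) < eps * d(u).
    return {u: val for u, val in q.items() if val >= eps * len(adj[u])}


if __name__ == "__main__":
    path = {0: [1], 1: [0, 2], 2: [1]}
    print(truncated_walk_step(path, {1: 1.0}, eps=0.0))
```

The loop visits each vertex of the support once and each of its incident edges once, so the work per step is proportional to $\mu(V_{t-1})$ rather than to the size of the whole graph.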

3 Nearly Linear-Time Graph Partitioning

In this section, we apply Nibble to design a partitioning algorithm Partition. This new algorithm runs in nearly linear time. It computes an approximate sparsest cut with approximately optimal balance. In particular, we prove that there exists a constant $\alpha > 0$ such that for any graph $G = (V, E)$ that has a cut $S$ of sparsity $\alpha \cdot \theta^2/\log^3 n$ and balance $b \le 1/2$, with high probability, Partition finds a cut $D$ with $\Phi_V(D) \le \theta$ and $\mathrm{bal}(D) \ge b/2$. Actually, Partition satisfies an even stronger guarantee: with high probability either the cut it outputs is well balanced,

$$\frac{1}{4}\mu(V) \le \mu(D) \le \frac{7}{8}\mu(V),$$

or touches most of the edges touching $S$,

$$\mu(D \cap S) \ge \frac{1}{2}\mu(S).$$

The expected running time of Partition is $O(m \log^7 n/\phi^4)$. Thus, it can be used to quickly find crude cuts.

Partition calls Nibble via a routine called Random Nibble that calls Nibble with carefully 
chosen random parameters. Random Nibble has a very small expected running time, and is 
expected to remove a similarly small fraction of any set with small conductance. 



3.1 Procedure Random Nibble

C = RandomNibble(G, $\phi$)

(1) Choose a vertex $v$ according to $\psi_V$.

(2) Choose a $b$ in $1, \dots, \lceil \log m \rceil$ according to

$$\Pr[b = i] = 2^{-i}/(1 - 2^{-\lceil \log m \rceil}).$$

(3) $C = \mathrm{Nibble}(G, v, \phi, b)$.
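The two random choices in RandomNibble can be sketched in Python as below. This is an illustration under assumptions: $\psi_V$ is taken to be the distribution that picks each vertex with probability proportional to its degree (consistent with the later statement that $v$ lands in a set with probability $\mu(\cdot)/\mu(V)$), $\log$ is read as $\log_2$, and `nibble` is a hypothetical callable standing in for Nibble, which is not implemented here.

```python
import math
import random
from typing import Callable, Dict, List, Set

Graph = Dict[int, List[int]]  # adjacency lists of an undirected graph


def b_distribution(m: int) -> List[float]:
    """Return [Pr[b = 1], ..., Pr[b = L]] with L = ceil(log2 m) (base 2 assumed)."""
    top = max(1, math.ceil(math.log2(m)))
    norm = 1.0 - 2.0 ** (-top)
    return [2.0 ** (-i) / norm for i in range(1, top + 1)]


def random_nibble(
    adj: Graph,
    phi: float,
    nibble: Callable[[Graph, int, float, int], Set[int]],
    rng: random.Random,
) -> Set[int]:
    """Pick v by degree and b by the truncated geometric law, then call nibble."""
    vertices = list(adj)
    degrees = [len(adj[v]) for v in vertices]
    m = sum(degrees) // 2  # number of edges
    v = rng.choices(vertices, weights=degrees, k=1)[0]
    probs = b_distribution(m)
    b = rng.choices(range(1, len(probs) + 1), weights=probs, k=1)[0]
    return nibble(adj, v, phi, b)
```

The normalizing factor $1 - 2^{-\lceil \log m \rceil}$ is exactly $\sum_{i=1}^{\lceil \log m \rceil} 2^{-i}$, so the probabilities sum to one.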





Lemma 3.1 (Random Nibble). Let $m$ be the number of edges in $G$. The expected running time of Random Nibble is $O(\log^7 m/\phi^4)$. If the set $C$ output by Random Nibble is non-empty, it satisfies

(R.1) $\Phi_V(C) \le \phi$, and

(R.2) $\mu(C) \le (5/6)\mu(V)$.

Moreover, for every set $S$ satisfying

$$\mu(S) \le (2/3)\mu(V) \quad\text{and}\quad \Phi_V(S) \le f_1(\phi),$$

(R.3) $\mathbf{E}[\mu(C \cap S)] \ge \mu(S)/4\mu(V)$.

Proof. The expected running time of Random Nibble may be upper bounded by

$$O\left(\sum_{i=1}^{\lceil \log m \rceil} \left(2^{-i}/(1 - 2^{-\lceil \log m \rceil})\right)\left(2^i \log^6(m)/\phi^4\right)\right) = O\left(\log^7(m)/\phi^4\right).$$

Parts (R.1) and (R.2) follow directly from part (N.1) of Theorem 2.1. To prove part (R.3), define $\alpha_b$ by

$$\alpha_b = \frac{\mu(S^g_b)}{\mu(S^g)}.$$
So, $\sum_b \alpha_b = 1$. For each $i$, the chance that $v$ lands in $S^g_i$ is $\alpha_i\,\mu(S^g)/\mu(V)$. Moreover, the chance that $b = i$ is at least $2^{-i}$. If $v$ lands in $S^g_i$, then by part (N.3) of Theorem 2.1, $C$ satisfies

$$\mu(C \cap S) \ge 2^{i-1}.$$

So,

$$\begin{aligned}
\mathbf{E}[\mu(C \cap S)] &\ge \sum_i 2^{-i}\,\alpha_i\,\left(\mu(S^g)/\mu(V)\right) 2^{i-1} \\
&= \mu(S^g)/2\mu(V) \\
&\ge \mu(S)/4\mu(V) \qquad \text{(by part (N.2.a) of Theorem 2.1).}
\end{aligned}$$

□
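The running-time bound in Lemma 3.1 rests on a short computation, spelled out here as a sketch. Write $L = \lceil \log m \rceil$ (a shorthand introduced only for this note) and use the $O(2^b \log^6 m/\phi^4)$ bound on the running time of Nibble from Section 2.4. The factor $2^{-i}$ in the probability cancels the factor $2^i$ in the cost, so every term of the sum contributes the same amount:

```latex
\sum_{i=1}^{L} \frac{2^{-i}}{1-2^{-L}} \cdot \frac{2^{i}\log^{6} m}{\phi^{4}}
  = \frac{L}{1-2^{-L}} \cdot \frac{\log^{6} m}{\phi^{4}}
  \le 2L \cdot \frac{\log^{6} m}{\phi^{4}}
  = O\!\left(\frac{\log^{7} m}{\phi^{4}}\right).
```

The inequality uses $L \ge 1$, which gives $1 - 2^{-L} \ge 1/2$.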



3.2 Partition 

We now define Partition and analyze its performance. First, define

$$f_2(\theta) = f_1(\theta/7)/2, \qquad (27)$$

and note

$$f_2(\theta) \ge \Omega\left(\frac{\theta^2}{\log^3 m}\right).$$

D = Partition(G, $\theta$, $p$), where G is a graph, $\theta, p \in (0, 1)$.

(0) Set $W_0 = V$, $j = 0$ and $\phi = \theta/7$.

(1) While $j < 12m\lceil \lg(1/p) \rceil$ and $\mu(W_j) \ge (3/4)\mu(V)$,

(a) Set $j = j + 1$.

(b) Set $D_j = \mathrm{RandomNibble}(G[W_{j-1}], \phi)$.

(c) Set $W_j = W_{j-1} - D_j$.

(2) Set $D = D_1 \cup \dots \cup D_j$.
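The control flow of Partition can be sketched in Python as follows. This is a minimal illustration, assuming adjacency-list dictionaries, volumes measured with the degrees of the original graph, $G[W]$ read as the plain induced subgraph, and a hypothetical callable `random_nibble(graph, phi, rng)` standing in for RandomNibble. The helper names `volume` and `induced` are introduced here only for the example.

```python
import math
import random
from typing import Callable, Dict, Iterable, List, Set

Graph = Dict[int, List[int]]  # adjacency lists of an undirected graph


def volume(adj: Graph, vertices: Iterable[int]) -> int:
    """mu(.): sum of degrees, taken in the original graph."""
    return sum(len(adj[v]) for v in vertices)


def induced(adj: Graph, keep: Set[int]) -> Graph:
    """Induced subgraph G[keep] (assumed reading of G[W])."""
    return {v: [w for w in adj[v] if w in keep] for v in keep}


def partition(
    adj: Graph,
    theta: float,
    p: float,
    random_nibble: Callable[[Graph, float, random.Random], Set[int]],
    rng: random.Random,
) -> Set[int]:
    mu_v = volume(adj, adj)
    m = mu_v // 2
    phi = theta / 7                                   # step (0)
    j_max = 12 * m * math.ceil(math.log2(1 / p))
    remaining: Set[int] = set(adj)                    # W_j
    vol_remaining = mu_v                              # mu(W_j), kept up to date
    cut: Set[int] = set()                             # D_1 u ... u D_j
    j = 0
    while j < j_max and vol_remaining >= 0.75 * mu_v:  # step (1)
        j += 1
        d_j = set(random_nibble(induced(adj, remaining), phi, rng)) & remaining
        remaining -= d_j
        vol_remaining -= volume(adj, d_j)
        cut |= d_j
    return cut                                        # step (2)
```

Rebuilding the induced subgraph on every iteration costs time linear in the graph, so this sketch shows the logic only and does not achieve the running time stated in Theorem 3.2.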



Theorem 3.2 (Partition). On input a graph with $m$ edges, the expected running time of Partition is $O(m \lg(1/p) \log^7 m/\theta^4)$. Let $D$ be the output of Partition(G, $\theta$, $p$), where G is a graph and $\theta, p \in (0, 1)$. Then

(P.1) $\mu(D) \le (7/8)\mu(V)$,

(P.2) If $D \ne \emptyset$ then $\Phi_V(D) \le \theta$, and

(P.3) If $S$ is any set satisfying

$$\mu(S) \le \mu(V)/2 \quad\text{and}\quad \Phi_V(S) \le f_2(\theta), \qquad (28)$$

then with probability at least $1 - p$, $\mu(D) \ge \mu(S)/2$.

In particular, with probability at least $1 - p$ either

(P.3.a) $\mu(D) \ge (1/4)\mu(V)$, or

(P.3.b) $\mu(S \cap D) \ge \mu(S)/2$.

Property (P.3) is a little unusual and deserves some explanation. It says that for every set S 
of low conductance, with high probability either D is a large fraction of S, or it is a large fraction 
of the entire graph. While we would like to pick just one of these properties and guarantee that 
it holds with high probability, this would be unreasonable: on one hand, there might be no big 
set D of small conductance; and, on the other hand, even if S is small the algorithm might cut 
out a large set D that completely avoids S. 

Proof of Theorem 3.2. The bound on the expected running time of Partition is immediate from the bound on the running time of RandomNibble.

Let $j_{out}$ be the iteration at which Partition stops, so that $D = D_1 \cup \dots \cup D_{j_{out}}$. To prove (P.1), note that $\mu(W_{j_{out}-1}) \ge (3/4)\mu(V)$ and so $\mu(D_1 \cup \dots \cup D_{j_{out}-1}) \le (1/4)\mu(V)$. By part (R.2) of Lemma 3.1, $\mu(D_{j_{out}}) \le (5/6)\mu(W_{j_{out}-1})$. So,

$$\mu(D_1 \cup \dots \cup D_{j_{out}}) \le \mu(V) - \mu(W_{j_{out}-1}) + \frac{5}{6}\mu(W_{j_{out}-1}) = \mu(V) - \frac{1}{6}\mu(W_{j_{out}-1}) \le \frac{7}{8}\mu(V).$$

To establish (P.2), we first compute

$$\begin{aligned}
|E(D, V - D)| &= \sum_{i=1}^{j_{out}} |E(D_i, V - D)| \\
&\le \sum_{i=1}^{j_{out}} |E(D_i, W_{i-1} - D_i)| \\
&\le \sum_{i=1}^{j_{out}} \phi\,\mu(D_i) \qquad \text{(by (R.1))} \\
&= \phi\,\mu(D).
\end{aligned}$$

So, if $\mu(D) \le \mu(V)/2$, then $\Phi_V(D) \le \phi$. On the other hand, we established above that $\mu(D) \le (7/8)\mu(V)$, from which it follows that

$$\mu(V - D) \ge (1/8)\mu(V) \ge (8/7)(1/8)\mu(D) = (1/7)\mu(D).$$

So

$$\Phi_V(D) = \frac{|E(D, V - D)|}{\min(\mu(D), \mu(V - D))} \le 7\,\frac{|E(D, V - D)|}{\mu(D)} \le 7\phi = \theta.$$
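The quantity bounded in (P.2) is easy to compute directly on a small example. The Python sketch below evaluates $\Phi_V(D) = |E(D, V - D)|/\min(\mu(D), \mu(V - D))$ for a graph given as adjacency lists; the function name `conductance` and the example graph are illustrative choices made for this note.

```python
from typing import Dict, Iterable, List

Graph = Dict[int, List[int]]  # adjacency lists of an undirected graph


def conductance(adj: Graph, subset: Iterable[int]) -> float:
    """Phi_V(D) = |E(D, V - D)| / min(mu(D), mu(V - D))."""
    d = set(subset)
    # Each cut edge is seen exactly once, from its endpoint inside D.
    cut_edges = sum(1 for u in d for w in adj[u] if w not in d)
    vol_d = sum(len(adj[u]) for u in d)
    vol_rest = sum(len(adj[u]) for u in adj) - vol_d
    smaller = min(vol_d, vol_rest)
    if smaller == 0:
        raise ValueError("conductance is undefined when one side has volume 0")
    return cut_edges / smaller


if __name__ == "__main__":
    # Two triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3).
    g = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
    print(conductance(g, {0, 1, 2}))
```

In the example, cutting between the two triangles removes one edge and each side has volume 7, so the cut has conductance $1/7$.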
Let $j_{max} = 12m\lceil \lg(1/p) \rceil$. To prove part (P.3), let $S$ satisfy (28) and consider what happens if we ignore the second condition in the while loop and run Partition for all potential $j_{max}$ iterations, obtaining cuts $D_1, \dots, D_{12m\lceil \lg(1/p) \rceil}$. Let

$$D_{\le j} = \cup_{i \le j} D_i.$$

We will prove that if neither

$$\mu(D_{\le k}) \ge \frac{\mu(V)}{4} \quad\text{nor}\quad \mu(S \cap D_{\le k}) \ge \frac{\mu(S)}{2}$$

hold at iteration $k$, then with probability at least 1/2, one of these conditions will be satisfied by iteration $k + 12m$. Thus, after all $j_{max}$ iterations, one of conditions (P.3.a) or (P.3.b) will be satisfied with probability at least $1 - p$. If the algorithm runs for fewer iterations, then condition (P.3.a) is satisfied.
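The step from "probability at least 1/2 per block of $12m$ iterations" to "probability at least $1 - p$ overall" is a standard amplification, sketched below in Python. The $j_{max}$ iterations split into $\lceil \lg(1/p) \rceil$ blocks, and the failure probability is at most $2^{-\lceil \lg(1/p) \rceil} \le p$. The function names are hypothetical and serve only to make the arithmetic concrete.

```python
import math


def max_iterations(m: int, p: float) -> int:
    """j_max = 12 m ceil(lg(1/p)), the iteration cap used by Partition."""
    return 12 * m * math.ceil(math.log2(1 / p))


def failure_bound(p: float) -> float:
    """Upper bound on the failure probability after ceil(lg(1/p)) blocks,

    assuming each block of 12m iterations fails with probability at most 1/2
    regardless of what happened in earlier blocks.
    """
    blocks = math.ceil(math.log2(1 / p))
    return 0.5 ** blocks


if __name__ == "__main__":
    for p in (0.25, 0.1, 0.01):
        print(p, max_iterations(10, p), failure_bound(p))
```

For instance, $p = 0.1$ gives four blocks and a failure bound of $1/16$, which is below the target of $0.1$.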

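To simplify notation, let $C_i = D_{k+i}$ and $U_i = W_{k+i}$, for $0 \le i \le 12m$. Assume that

$$\mu(U_0) \ge \frac{3}{4}\mu(V) \quad\text{and}\quad \mu(S \cap U_0) \ge \frac{1}{2}\mu(S).$$

For $1 \le i \le 12m$, define the random variable

$$X_i = \frac{\mu(C_i \cap S)}{\mu(U_0 \cap S)}.$$

As each set $C_i$ is a subset of $U_0$, and the $C_i$ are mutually disjoint, we will always have

$$\sum_{i=1}^{12m} X_i \le 1.$$

Define $\beta$ to satisfy

$$(1 - \beta)\,\mu(S \cap U_0) = \frac{1}{2}\mu(S),$$

and note that this ensures $0 \le \beta \le 1/2$. Moreover, if $\sum_i X_i \ge \beta$, then $\mu(S \cap D_{\le k+12m}) \ge \mu(S)/2$ will hold.

Let $E_j$ be the event

$$\mu(U_j) < \frac{3}{4}\mu(V).$$

We need to show that, with probability at least 1/2, either an event $E_j$ holds, or $\sum_i X_i \ge \beta$.

To this end, we now show that if neither $E_j$ nor $\sum_{i \le j} X_i \ge \beta$ holds, then $\mathbf{E}[X_{j+1}] \ge 1/8m$. If $\sum_{i \le j} X_i < \beta$, then

$$\mu(S \cap U_j) = \mu(S \cap U_0) - \sum_{i \le j} \mu(S \cap C_i) = \mu(S \cap U_0)\Big(1 - \sum_{i \le j} X_i\Big) > \mu(S \cap U_0)(1 - \beta) = \frac{1}{2}\mu(S).$$

If $E_j$ does not hold, then

$$\mu(U_j - S \cap U_j) = \mu(U_j) - \mu(S \cap U_j) \ge \frac{3}{4}\mu(V) - \mu(S) \ge \frac{1}{4}\mu(V) \ge \frac{1}{2}\mu(S).$$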
So,

$$\Phi_{U_j}(S \cap U_j) = \frac{|E(S \cap U_j, U_j - S \cap U_j)|}{\min\left(\mu(S \cap U_j), \mu(U_j - S \cap U_j)\right)} \le \frac{|E(S, V - S)|}{(1/2)\mu(S)} \le 2\Phi_V(S) \le 2f_2(\theta) = f_1(\phi).$$

We also have $\mu(S \cap U_j) \le \mu(S) \le (2/3)\mu(U_j)$, so the conditions of part (R.3) of Lemma 3.1 are satisfied and

$$\mathbf{E}[X_{j+1}] \ge \frac{1}{4\mu(U_j)} \ge \frac{1}{8m}.$$

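Now, set

$$Y_j = \begin{cases} 1/8m & \text{if } \sum_{i<j} X_i \ge \beta, \text{ or if } E_{j-1}, \\ X_j & \text{otherwise.} \end{cases}$$

So, for all $j$ we have $\mathbf{E}[Y_j] \ge 1/8m$, and so $\mathbf{E}\big[\sum_{j \le 12m} Y_j\big] \ge 3/2$. On the other hand,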

i<12m 

So, with probability at least 1/2, 

j<12m 

This implies that with probability at least 1/2 either $\sum_i X_i \ge \beta$ or some event $E_j$ holds, which is what we needed to show. □